Convolutional Neural Networks (CNN)


By Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH

1. Convolution

1.1. 1D Convolution


1.1.1. Denoising

In [1]:
import numpy as np
import cv2
from scipy import signal
import matplotlib.pyplot as plt
%matplotlib inline
In [2]:
# piecewise smooth signal
N = 50
n = np.arange(N)

v = np.hstack([np.ones([1, N//2]), -np.ones([1, N//2])])
x = np.sin(np.pi/N*n)*v
xn = x + 0.1*np.random.randn(1, N)

# construct moving average filter impulse response of length M
M = 5
h = np.zeros([1, M])
h[0, :] = 1/M

# convolve noisy signal with impulse response
y = signal.convolve(xn, h)

# plot
plt.figure(figsize=(12,8))
plt.subplot(2,2,1)
plt.stem(x.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('piecewise smooth signal')
plt.subplot(2,2,2)
plt.stem(xn.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('piecewise smooth signal + noise')
plt.subplot(2,2,3)
plt.stem(h.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('impulse response')
plt.subplot(2,2,4)
plt.stem(y.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('convolved output')
plt.show()

1.1.2. Edge Detection

In [3]:
# haar wavelet edge detector
M = 2
h = np.zeros([1, M])
h[0, 0] = -1/M
h[0, 1] = 1/M

# convolve noisy signal with impulse response
y = signal.convolve(xn, h)

# plot
plt.figure(figsize=(12,8))
plt.subplot(2,2,1)
plt.stem(x.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('piecewise smooth signal')
plt.subplot(2,2,2)
plt.stem(xn.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('piecewise smooth signal + noise')
plt.subplot(2,2,3)
plt.stem(h.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('impulse response')
plt.subplot(2,2,4)
plt.stem(y.T, linefmt = 'b', markerfmt = 'w', basefmt = 'r')
plt.xlim([-1, N])
plt.ylim([-1.5, 1.5])
plt.title('convolved output')
plt.show()

1.2. Images


1.3. Convolution on Image (= Convolution in 2D)

Filter (or Kernel)

  • Modify or enhance an image by filtering
  • Filter images to emphasize certain features or remove other features
  • Filtering includes smoothing, sharpening and edge enhancement

  • Discrete convolution can be viewed as multiplication by a matrix built from the kernel (a Toeplitz matrix)

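The last bullet can be made concrete with a short sketch (my own illustration, using `scipy.linalg.convolution_matrix`, available in SciPy ≥ 1.5): convolving with a kernel is the same as multiplying by a Toeplitz matrix whose rows are shifted copies of that kernel.

```python
import numpy as np
from scipy.linalg import convolution_matrix

x = np.array([1., 2., 3., 4., 5.])
h = np.ones(3) / 3                      # length-3 moving-average kernel

# 'full' convolution expressed as a matrix-vector product
A = convolution_matrix(h, len(x), mode='full')   # shape (7, 5)

print(np.allclose(A @ x, np.convolve(x, h)))     # True
```

Each row of `A` contains the (flipped, shifted) kernel, so the matrix is sparse and banded — the same structural observation that motivates convolutional layers.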

How to find the right Kernels

  • Many different kernels, each producing a specific effect on images, have traditionally been designed by hand

  • Let's take the opposite approach

  • We do not design the kernel; we learn the kernel from data

  • i.e., a feature extractor can be learned from data using a deep learning framework

1.3.1. Denoising

In [4]:
# noise image

img = cv2.imread('./data_files/lena_sigma25.png', 0)
print(img.shape)

plt.figure(figsize = (6,6))
plt.imshow(img, 'gray')
plt.axis('off')
plt.show()
(512, 512)
In [5]:
# smoothing or noise reduction

M = np.ones([3,3])/9

img_conv = signal.convolve2d(img, M, 'same')

print(img_conv.shape)

plt.figure(figsize = (12,6))
plt.subplot(1,2,1)
plt.imshow(img, 'gray')
plt.title('Noisy Image')
plt.axis('off')
plt.subplot(1,2,2)
plt.imshow(img_conv, 'gray')
plt.title('Smoothed Image')
plt.axis('off')
plt.show()
(512, 512)

1.3.2. Edge Detection

In [6]:
# original image

img = cv2.imread('./data_files/lena.png', 0)
print(img.shape)

plt.figure(figsize = (6,6))
plt.imshow(img, 'gray')
plt.axis('off')
plt.show()
(512, 512)
In [7]:
# guess what kind of image will be produced after convolution

Mv  = np.array([[1, 0, -1],
                [1, 0, -1],
                [1, 0, -1]])

img_conv = signal.convolve2d(img, Mv, 'same')

plt.figure(figsize = (12,6))
plt.imshow(img_conv, 'gray')
plt.axis('off')
plt.show()       
In [8]:
# guess what kind of image will be produced after convolution

Mh = np.array([[1, 1, 1],
               [0, 0, 0],
               [-1, -1, -1]])

img_conv = signal.convolve2d(img, Mh, 'same')

plt.figure(figsize = (6,6))
plt.imshow(img_conv, 'gray')
plt.axis('off')
plt.show()
In [9]:
M = (Mv + Mh)/2

img_conv = signal.convolve2d(img, M, 'same')

plt.figure(figsize = (6,6))
plt.imshow(img_conv, 'gray')
plt.axis('off')
plt.show()

2. Convolutional Neural Networks (CNN)

2.1. Motivation: Learning Visual Features


The bird occupies a local area and looks the same in different parts of an image. We should construct neural networks which exploit these properties.



  • ANN structure for object detection in images

    • does not seem the best
    • does not make use of the fact that we are dealing with images
    • spatial organization of the input is destroyed by flattening


  • Convolution Mask + Neural Network
    • Utilize spatial information of the image



  • Locality: objects tend to have a local spatial support
    • fully connected layer $\rightarrow$ locally connected layer



  • Translation invariance: object appearance is independent of location
    • Weight sharing: units connected to different locations have the same weights
    • We are not designing the kernel, but are learning the kernel from data
    • i.e. We are learning visual feature extractor from data


In [10]:
%%html
<center><iframe src="https://www.youtube.com/embed/0Hr5YwUUhr0?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

2.2. Convolutional Operator

Convolution in CNNs

  • Local connectivity
  • Weight sharing
  • Typically have sparse interactions

(Deep) Artificial Neural Networks

  • Universal function approximator
    • Linear connected networks
    • Simple nonlinear neurons
  • Hidden layers
    • Autonomous feature learning


(Deep) Convolutional Neural Networks

  • Structure
    • Weight sharing
    • Local connectivity
  • Optimization
    • Smaller searching space


  • Convolutional Neural Networks
    • Simply neural networks that use convolution in place of general matrix multiplication in at least one of their layers
In [11]:
%%html
<center><iframe src="https://www.youtube.com/embed/ISHGyvsT0QY?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
  • Multiple channels


  • Multiple kernels


  • Convolution Shape
    • Activation or feature maps
      • input $(W^i, H^i, C)$
      • output $(W^o, H^o, D)$
    • Kernel or filter $(w, h, C, D)$
      • $w \times h$ kernel size
      • $C$ input channels
      • $D$ output channels
    • Number of parameters (the $+1$ accounts for the bias)
      • $(w \times h \times C + 1) \times D$
  • The kernel is not slid across channels, only across rows and columns.
  • Note that a convolution preserves the signal support structure.
  • A 1D signal is converted into a 1D signal, a 2D signal into a 2D, and neighboring parts of the input signal influence neighboring parts of the output signal.
  • A 3D convolution can be used if the channel index has some metric meaning, such as time for a series of grayscale video frames. Otherwise, sliding across channels makes no sense.
  • We usually refer to one of the channels generated by a convolution layer as an activation map.
  • The sub-area of an input map that influences a component of the output is called the receptive field of the latter.
  • In the context of convolutional networks, a standard linear layer is called a fully connected layer since every input influences every output.
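As a quick sanity check on the shape and parameter-count rules above (my own arithmetic, not from the original notes), consider a $3\times3$ kernel over an RGB input with 32 output channels:

```python
# Parameter count of one conv layer:
#   kernel w x h, C input channels, D output channels,
#   plus one bias per output channel
w, h, C, D = 3, 3, 3, 32

n_params = (w * h * C + 1) * D   # (3*3*3 + 1) * 32
print(n_params)                  # 896
```

This matches the count a deep learning framework would report for such a layer, and shows why weight sharing keeps convolutional layers so small compared to fully connected ones.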

2.3. Stride and Padding

  • Strides: increment step size for the convolution operator
    • Reduces the size of the output map
  • No stride and no padding


  • Stride example with kernel size 3×3 and a stride of 2


  • Padding: artificially fill borders of image
    • Useful to keep spatial dimension constant across filters
    • Useful with strides and large receptive fields
    • Usually fill with 0s


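A common way to summarize the combined effect of stride and padding (my own addition) is the output-size formula $o = \lfloor (i + 2p - k)/s \rfloor + 1$, for input size $i$, kernel size $k$, stride $s$, and padding $p$:

```python
def conv_output_size(i, k, s=1, p=0):
    """Output length of a convolution along one spatial dimension."""
    return (i + 2 * p - k) // s + 1

print(conv_output_size(5, 3))        # no stride, no padding -> 3
print(conv_output_size(5, 3, s=2))   # stride 2 shrinks the map -> 2
print(conv_output_size(5, 3, p=1))   # padding 1 keeps the size -> 5
```

For a $3\times3$ kernel, padding with one row/column of zeros on each border ('SAME' padding) keeps the spatial dimensions constant, while a stride of 2 roughly halves them.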
In [12]:
%%html
<center><iframe src="https://www.youtube.com/embed/W4xtf8LTz1c?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

2.4. Nonlinear Activation Function


2.5. Pooling

  • Compute a maximum value in a sliding window (max pooling)
    • Reduce spatial resolution for faster computation
    • Achieve invariance to any permutation inside one of the cells


  • Pooling size : $2\times2$ for example
  • Pooling with multiple channels (applied to each channel independently)


  • Max pooling introduces invariances
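A $2\times2$ max pooling with stride 2 over a single channel can be sketched in a few lines of NumPy (my own illustration): reshape the map into $2\times2$ blocks and take the maximum of each block.

```python
import numpy as np

x = np.array([[1, 3, 2, 4],
              [5, 6, 7, 8],
              [3, 2, 1, 0],
              [1, 2, 3, 4]], dtype=float)

# group into 2x2 blocks, then take the max inside each block
pooled = x.reshape(2, 2, 2, 2).max(axis=(1, 3))
print(pooled)   # [[6. 8.]
                #  [3. 4.]]
```

Note that permuting the four entries inside any one block leaves the output unchanged — the invariance mentioned above.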
In [13]:
%%html
<center><iframe src="https://www.youtube.com/embed/FG7M9tWH2nQ?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

2.6. Inside the Convolution Layer

  • First, the layer performs several convolutions to produce a set of linear activations
  • Second, each linear activation is run through a nonlinear activation function
  • Third, pooling is used to further modify the output of the layer


2.7. CNN for Classification

  • CONV and POOL layers output high-level features of input
  • Fully connected layer uses these features for classifying input image
  • Express output as probability of image belonging to a particular class



In [14]:
%%html
<center><iframe src="https://www.youtube.com/embed/utOv-BKI_vo?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>        

ConvNet performs better with the same number of parameters, due to its use of prior knowledge about images.

3. Lab: CNN with TensorFlow

  • MNIST example
  • To classify handwritten digits



In [15]:
%%html
<center><iframe src="https://www.youtube.com/embed/z6k_RMKExlQ?start=5150&end=6132?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

3.1. Import Library and Load MNIST Data

In [16]:
# Import Library
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

3.2. Define the CNN Structure

Convolution layers

  • First, the layer performs several convolutions to produce a set of linear activations
  • Second, each linear activation is run through a nonlinear activation function
  • Third, pooling is used to further modify the output of the layer

Fully connected layers

  • Simple multi-layer perceptrons



In [17]:
# input layer
input_h = 28 # Input height
input_w = 28 # Input width
input_ch = 1 # Input channel : Gray scale
# (None, 28, 28, 1)

# First convolution layer
k1_h = 3
k1_w = 3
k1_ch = 32 
p1_h = 2
p1_w = 2
# (None, 14, 14 ,32)

# Second convolution layer
k2_h = 3
k2_w = 3
k2_ch = 64
p2_h = 2
p2_w = 2
# (None, 7, 7 ,64)

## Fully connected: flatten the features -> (None, 7*7*64)
conv_result_size = int(input_h/(p1_h*p2_h)) * int(input_w/(p1_w*p2_w)) * k2_ch
n_hidden = 100
n_output = 10

3.3. Define Weights, Biases and Placeholder

  • Define parameters based on predefined layer size
  • Initialize with normal distribution with $\mu = 0$ and $\sigma = 0.1$
In [18]:
# kernel size: [kernel_height, kernel_width, input_ch, output_ch]
weights = {
    'conv1' : tf.Variable(tf.random_normal([k1_h, k1_w, input_ch, k1_ch], stddev = 0.1)),
    'conv2' : tf.Variable(tf.random_normal([k2_h, k2_w, k1_ch, k2_ch], stddev = 0.1)),
    'hidden' : tf.Variable(tf.random_normal([conv_result_size, n_hidden], stddev = 0.1)),
    'output' : tf.Variable(tf.random_normal([n_hidden, n_output], stddev = 0.1))
}

# bias size: [output_ch] or [neuron_size]
biases = {
    'conv1' : tf.Variable(tf.random_normal([k1_ch], stddev = 0.1)),
    'conv2' : tf.Variable(tf.random_normal([k2_ch], stddev = 0.1)),
    'hidden' : tf.Variable(tf.random_normal([n_hidden], stddev = 0.1)),
    'output' : tf.Variable(tf.random_normal([n_output], stddev = 0.1))
}

# input layer: [batch_size, image_height, image_width, channels]
# output layer: [batch_size, class_size]
x = tf.placeholder(tf.float32, [None, input_h, input_w, input_ch])
y = tf.placeholder(tf.float32, [None, n_output])

3.4. Build a CNN Model

First, the layer performs several convolutions to produce a set of linear activations




tf.nn.conv2d(input, filter, strides, padding)

    input = tensor of shape [None, input_h, input_w, input_ch]
    filter = tensor of shape [k_h, k_w, input_ch, output_ch]
    strides = [1, s_h, s_w, 1]
    padding = 'SAME'


  • filter size

    • the field of view of the convolution.
  • stride

    • the step size of the kernel when traversing the image.
  • padding

    • how the border of a sample is handled.
    • A padded convolution will keep the spatial output dimensions equal to the input, whereas unpadded convolutions will crop away some of the borders if the kernel is larger than 1.
    • 'SAME' : zero padding
  • input and output channels
    • A convolutional layer takes a certain number of input channels ($C$) and calculates a specific number of output channels ($D$).
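The padded vs. unpadded distinction above can be seen directly with SciPy's `convolve2d` modes, already used in Section 1.3 (a minimal sketch: `mode='same'` zero-pads so the output matches the input size, `mode='valid'` does not pad and crops the borders):

```python
import numpy as np
from scipy import signal

img = np.random.rand(8, 8)     # toy 8x8 single-channel image
k = np.ones((3, 3)) / 9        # 3x3 averaging kernel

print(signal.convolve2d(img, k, mode='same').shape)   # (8, 8)  zero-padded
print(signal.convolve2d(img, k, mode='valid').shape)  # (6, 6)  borders cropped
```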

Second, each linear activation is run through a nonlinear activation function



Third, use pooling to further modify the output of the layer

  • Compute a maximum value in a sliding window (max pooling)



tf.nn.max_pool(value, ksize, strides, padding)

    value = tensor of shape [None, input_h, input_w, input_ch]
    ksize = [1, p_h, p_w, 1]
    strides = [1, p_h, p_w, 1]
    padding = 'VALID'


  • ksize

    • The size of the window for each dimension of the input tensor
  • strides

    • The stride of the sliding window for each dimension of the input tensor.
  • padding

    • 'VALID' : No padding

Dense (fully connected) layer

  • Input is typically in a form of flattened features

  • Then, apply softmax to multiclass classification problems

  • The output of the softmax function is equivalent to a categorical probability distribution; it tells you the probability that each of the classes is true.


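The softmax step can be sketched directly in NumPy (my own illustration): the class scores are exponentiated and normalized so they sum to one, with the largest score mapped to the largest probability.

```python
import numpy as np

def softmax(z):
    z = z - z.max()        # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])
p = softmax(scores)
print(p)          # largest score -> largest probability
print(p.sum())    # 1.0
```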

In [19]:
# [batch, height, width, channels]

def net(x, weights, biases):
    # First convolution layer
    conv1 = tf.nn.conv2d(x, 
                         weights['conv1'], 
                         strides = [1, 1, 1, 1], 
                         padding = 'SAME')
    conv1 = tf.nn.relu(tf.add(conv1, biases['conv1']))
    maxp1 = tf.nn.max_pool(conv1, 
                           ksize = [1, p1_h, p1_w, 1], 
                           strides = [1, p1_h, p1_w, 1], 
                           padding = 'VALID')
    
    # Second convolution layer
    conv2 = tf.nn.conv2d(maxp1, 
                         weights['conv2'], 
                         strides = [1, 1, 1, 1], 
                         padding = 'SAME')
    conv2 = tf.nn.relu(tf.add(conv2, biases['conv2']))
    maxp2 = tf.nn.max_pool(conv2, 
                           ksize = [1, p2_h, p2_w, 1], 
                           strides = [1, p2_h, p2_w, 1], 
                           padding = 'VALID')

    maxp2_flatten = tf.reshape(maxp2, [-1, conv_result_size])
    
    # Fully connected
    hidden = tf.add(tf.matmul(maxp2_flatten, weights['hidden']), biases['hidden'])
    hidden = tf.nn.relu(hidden)
    output = tf.add(tf.matmul(hidden, weights['output']), biases['output'])
    
    return output

3.5. Define Loss and Optimizer

Loss

  • Classification: Cross entropy
    • Equivalent to applying logistic regression
$$ -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\left(h_{\theta}\left(x^{(i)}\right)\right) + \left(1-y^{(i)}\right)\log\left(1-h_{\theta}\left(x^{(i)}\right)\right)\right] $$

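Numerically, the cross-entropy with one-hot labels reduces to the negative log-probability assigned to the true class. A small sketch (my own illustration of what `softmax_cross_entropy_with_logits` computes from logits):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

logits = np.array([[2.0, 0.5, 0.1]])
labels = np.array([[1.0, 0.0, 0.0]])    # one-hot: true class is 0

p = softmax(logits)
ce = -np.sum(labels * np.log(p), axis=-1)
print(ce)   # small loss, since the highest logit matches the true class
```

If the highest logit corresponded to a wrong class instead, the probability of the true class would be small and the loss correspondingly large.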
Optimizer

  • GradientDescentOptimizer
  • AdamOptimizer: one of the most popular optimizers
In [20]:
LR = 0.0001

pred = net(x, weights, biases)
loss = tf.nn.softmax_cross_entropy_with_logits(labels = y, logits = pred)
loss = tf.reduce_mean(loss)

optm = tf.train.AdamOptimizer(LR).minimize(loss)

3.6. Optimize

  • Define hyperparameters for training the CNN

    • n_batch : mini-batch size for stochastic gradient descent
    • n_iter : the number of training steps
    • n_prt : print the loss every n_prt iterations
In [21]:
n_batch = 50
n_iter = 2500
n_prt = 250

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

loss_record_train = []
loss_record_test = []
for epoch in range(n_iter):
    train_x, train_y = mnist.train.next_batch(n_batch)
    train_x = np.reshape(train_x, [-1, input_h, input_w, input_ch])
    sess.run(optm, feed_dict = {x: train_x,  y: train_y})
    
    if epoch % n_prt == 0:
        test_x, test_y = mnist.test.next_batch(n_batch)
        test_x = np.reshape(test_x, [-1, input_h, input_w, input_ch])
        c1 = sess.run(loss, feed_dict = {x: train_x, y: train_y})
        c2 = sess.run(loss, feed_dict = {x: test_x, y: test_y})
        loss_record_train.append(c1)
        loss_record_test.append(c2)
        print ("Iter : {}".format(epoch))
        print ("Cost : {}".format(c1))

plt.figure(figsize = (10,8))
plt.plot(np.arange(len(loss_record_train))*n_prt, loss_record_train, label = 'train')
plt.plot(np.arange(len(loss_record_test))*n_prt, loss_record_test, label = 'test')
plt.xlabel('iteration', fontsize = 15)
plt.ylabel('loss', fontsize = 15)
plt.legend(fontsize = 12)
plt.ylim([0, np.max(loss_record_train)])
plt.show()
Iter : 0
Cost : 2.885744571685791
Iter : 250
Cost : 0.6508976817131042
Iter : 500
Cost : 0.360478013753891
Iter : 750
Cost : 0.14413484930992126
Iter : 1000
Cost : 0.1944809854030609
Iter : 1250
Cost : 0.24603687226772308
Iter : 1500
Cost : 0.12560731172561646
Iter : 1750
Cost : 0.07864823192358017
Iter : 2000
Cost : 0.15150028467178345
Iter : 2250
Cost : 0.05812130123376846

3.7. Test or Evaluate

In [22]:
test_x, test_y = mnist.test.next_batch(100)

my_pred = sess.run(pred, feed_dict = {x: test_x.reshape(-1, 28, 28, 1)})
my_pred = np.argmax(my_pred, axis = 1)

labels = np.argmax(test_y, axis = 1)

accr = np.mean(np.equal(my_pred, labels))
print("Accuracy : {}%".format(accr*100))
Accuracy : 99.0%
In [23]:
test_x, test_y = mnist.test.next_batch(1)
logits = sess.run(tf.nn.softmax(pred), feed_dict = {x: test_x.reshape(-1, 28, 28, 1)})
predict = np.argmax(logits)

plt.figure(figsize = (12,5))
plt.subplot(1,2,1)
plt.imshow(test_x.reshape(28, 28), 'gray')
plt.axis('off')
plt.subplot(1,2,2)
plt.stem(logits.ravel())
plt.show()

np.set_printoptions(precision = 2, suppress = True)
print('Prediction : {}'.format(predict))
print('Probability : {}'.format(logits.ravel()))
Prediction : 5
Probability : [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]

3.8. CNN Implemented in an Embedded System

In [24]:
%%html
<center><iframe src="https://www.youtube.com/embed/baPLXhjslL8?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>

4. tf.layers

  • high-level API
In [25]:
input_h = 28
input_w = 28
input_ch = 1

k1_h = 3
k1_w = 3
k1_ch = 32 
p1_h = 2
p1_w = 2

k2_h = 3
k2_w = 3
k2_ch = 64
p2_h = 2
p2_w = 2

conv_result_size = int(input_h/(p1_h*p2_h)) * int(input_w/(p1_w*p2_w)) * k2_ch
n_hidden = 100
n_output = 10
In [26]:
x = tf.placeholder(tf.float32, [None, input_h, input_w, input_ch])
y = tf.placeholder(tf.float32, [None, n_output])
In [27]:
def net(x):
    ## First convolution layer
    conv1 = tf.layers.conv2d(inputs = x, 
                             filters = 32, 
                             kernel_size = [3, 3], 
                             padding = "SAME", 
                             activation = tf.nn.relu,
                             kernel_initializer = tf.initializers.random_normal())
    maxp1 = tf.layers.max_pooling2d(inputs = conv1, 
                                    pool_size = [2, 2], 
                                    strides = 2)
    
    ## Second convolution layer
    conv2 = tf.layers.conv2d(inputs = maxp1, 
                             filters = 64, 
                             kernel_size = [3, 3], 
                             padding = "SAME", 
                             activation = tf.nn.relu,
                             kernel_initializer = tf.initializers.random_normal())
    maxp2 = tf.layers.max_pooling2d(inputs = conv2, 
                                    pool_size = [2, 2], 
                                    strides = 2)

    maxp2_re = tf.reshape(maxp2, [-1, conv_result_size])
    
    ### Fully connected (= dense connected)
    hidden = tf.layers.dense(inputs = maxp2_re, 
                             units = n_hidden, 
                             activation = tf.nn.relu)
    output = tf.layers.dense(inputs = hidden, 
                             units = n_output)
    
    return output
In [28]:
LR = 0.0001

pred = net(x)
loss = tf.nn.softmax_cross_entropy_with_logits(labels = y, logits = pred)
loss = tf.reduce_mean(loss)

optm = tf.train.AdamOptimizer(LR).minimize(loss)

n_batch = 50
n_iter = 2500
n_prt = 250

sess = tf.Session()
init = tf.global_variables_initializer()
sess.run(init)

loss_record_train = []
loss_record_test = []
for epoch in range(n_iter):
    train_x, train_y = mnist.train.next_batch(n_batch)
    train_x = np.reshape(train_x, [-1, input_h, input_w, input_ch])
    sess.run(optm, feed_dict = {x: train_x,  y: train_y})
    
    if epoch % n_prt == 0:
        test_x, test_y = mnist.test.next_batch(n_batch)
        test_x = np.reshape(test_x, [-1, input_h, input_w, input_ch])
        c1 = sess.run(loss, feed_dict = {x: train_x, y: train_y})
        c2 = sess.run(loss, feed_dict = {x: test_x, y: test_y})
        loss_record_train.append(c1)
        loss_record_test.append(c2)
        print ("Iter : {}".format(epoch))
        print ("Cost : {}".format(c1))

plt.figure(figsize = (10,8))
plt.plot(np.arange(len(loss_record_train))*n_prt, loss_record_train, label = 'train')
plt.plot(np.arange(len(loss_record_test))*n_prt, loss_record_test, label = 'test')
plt.xlabel('iteration', fontsize = 15)
plt.ylabel('loss', fontsize = 15)
plt.legend(fontsize = 12)
plt.ylim([0, np.max(loss_record_train)])
plt.show()
Iter : 0
Cost : 25.75640106201172
Iter : 250
Cost : 0.6082043647766113
Iter : 500
Cost : 0.2860710024833679
Iter : 750
Cost : 0.15792465209960938
Iter : 1000
Cost : 0.08461542427539825
Iter : 1250
Cost : 0.15327845513820648
Iter : 1500
Cost : 0.16603229939937592
Iter : 1750
Cost : 0.0775686576962471
Iter : 2000
Cost : 0.003426922485232353
Iter : 2250
Cost : 0.14362271130084991
In [29]:
test_x, test_y = mnist.test.next_batch(100)

my_pred = sess.run(pred, feed_dict = {x: test_x.reshape(-1, 28, 28, 1)})
my_pred = np.argmax(my_pred, axis = 1)

labels = np.argmax(test_y, axis = 1)

accr = np.mean(np.equal(my_pred, labels))
print("Accuracy : {}%".format(accr*100))
Accuracy : 97.0%
In [30]:
test_x, test_y = mnist.test.next_batch(1)
logits = sess.run(tf.nn.softmax(pred), feed_dict = {x: test_x.reshape(-1, 28, 28, 1)})
predict = np.argmax(logits)

plt.figure(figsize = (12,5))
plt.subplot(1,2,1)
plt.imshow(test_x.reshape(28, 28), 'gray')
plt.axis('off')
plt.subplot(1,2,2)
plt.stem(logits.ravel())
plt.show()

np.set_printoptions(precision = 2, suppress = True)
print('Prediction : {}'.format(predict))
print('Probability : {}'.format(logits.ravel()))
Prediction : 9
Probability : [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]

5. Other Tutorials

  • Neural Network 3D Simulation
In [31]:
%%html
<center><iframe src="https://www.youtube.com/embed/3JQ3hYko51Y?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [32]:
%%html
<center><iframe src="https://www.youtube.com/embed/FTr3n7uBIuE?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [33]:
%%html
<center><iframe src="https://www.youtube.com/embed/LxfUGhug-iQ?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [34]:
%%html
<center><iframe src="https://www.youtube.com/embed/bNb2fEVKeEo?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>
In [35]:
%%html
<center><iframe src="https://www.youtube.com/embed/NVH8EYPHi30?rel=0" 
width="560" height="315" frameborder="0" allowfullscreen></iframe></center>